Analysis of Memory Hierarchy Performance of Block Data Layout
نویسندگان
چکیده
Recently, several experimental studies have been conducted on block data layout as a data transformation technique used in conjunction with tiling to improve cache performance. In this paper, we provide a theoretical analysis for the TLB and cache performance of block data layout. For standard matrix access patterns, we derive an asymptotic lower bound on the number of TLB misses for any data layout and show that block data layout achieves this bound. We show that block data layout improves TLB misses by a factor of O B compared with conventional data layouts, where B is the block size of block data layout. This reduction contributes to the improvement in memory hierarchy performance. Using our TLB and cache analysis, we also discuss the impact of block size on the overall memory hierarchy performance. These results are validated through simulations and experiments on state-of-the-art platforms.
منابع مشابه
Tiling, Block Data Layout, and Memory Hierarchy Performance
Recently, several experimental studies have been conducted on block data layout in conjunction with tiling as a data transformation technique to improve cache performance. In this paper, we analyze cache and TLB performance of such alternate layouts (including block data layout and Morton layout) when used in conjunction with tiling. We derive a tight lower bound on TLB performance for standard...
متن کاملEfficient Tree Layout in a Multilevel Memory Hierarchy
We consider the problem of laying out a tree with fixed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given probability distribution on the leaves. This problem was previously considered by Gil and Itai, who developed optimal algorithms when the block-transfer size i...
متن کاملEÆcient Tree Layout in a Multilevel Memory Hierarchy
We consider the problem of laying out a tree with xed parent/child structure in hierarchical memory. The goal is to minimize the expected number of block transfers performed during a search along a root-to-leaf path, subject to a given probability distribution on the leaves. This problem was previously considered by Gil and Itai, who developed optimal algorithms when the block-transfer size B i...
متن کاملComprehensive Hardware and Software Support for Operating Systems to Exploit
ÐHigh-performance multiprocessor workstations are becoming increasingly popular. Since many of the workloads running on these machines are operating-system intensive, we are interested in exploring the types of support for the operating system that the memory hierarchy of these machines should provide. In this paper, we evaluate a comprehensive set of hardware and software supports that minimiz...
متن کاملComprehensive Hardware and Software Support for Operating Systems to Exploit MP Memory Hierarchies
High-performance multiprocessor workstations are becoming increasingly popular. Since many of the workloads running on these machines are operating-system intensive, we are interested in what sort of support for the operating system should the memory hierarchy of these machines provide. This paper addresses this question. This paper shows that the largest performance losses for the operating sy...
متن کامل